| Stage | Phase A | Phase B | LR Sweep Total | Main Training | Stage Total |
|---|---|---|---|---|---|
| Stage 1 | 00:06:58 | 00:17:29 | 00:24:28 | 00:08:24 | 00:32:53 |
| Stage 2 | 00:10:35 | 00:23:41 | 00:34:16 | 00:06:39 | 00:40:56 |
| Stage 3 | 00:10:49 | 00:27:22 | 00:38:12 | 00:05:58 | 00:44:11 |
| Stage 4 | 00:10:50 | 00:27:31 | 00:38:21 | 00:06:00 | 00:44:22 |
| Stage 5 | 00:09:15 | 00:15:21 | 00:24:36 | 00:07:10 | 00:31:47 |
| Stage 6 | 00:10:38 | 00:31:14 | 00:41:52 | 00:05:26 | 00:47:20 |
| Stage 7 | 00:10:41 | 00:27:42 | 00:38:23 | 00:06:07 | 00:44:31 |
| Stage 8 | 00:10:52 | 00:31:16 | 00:42:08 | 00:05:00 | 00:47:10 |
| Stage 9 | 00:10:50 | 00:26:27 | 00:37:18 | 00:06:02 | 00:43:20 |
| Stage 10 | 00:10:48 | 00:31:16 | 00:42:04 | 00:05:18 | 00:47:23 |
| TOTAL | 01:42:20 | 04:19:23 | 06:01:44 | 01:02:08 | 07:03:57 |

Selected LR: 1.29e-03

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Survivor |
|---|---|---|---|---|---|
| 1.29e-03 | 0.047941 | 0.047941 | ±0.000000 | 1 | ✓ |
| 4.64e-04 | 0.078345 | 0.078345 | ±0.000000 | 1 | ✓ |
| 1.67e-04 | 0.090350 | 0.090350 | ±0.000000 | 1 | ✓ |
| 5.99e-05 | 0.102351 | 0.102351 | ±0.000000 | 1 | ✓ |
| 2.15e-05 | 0.104585 | 0.104585 | ±0.000000 | 1 | ✓ |
| 7.74e-06 | 0.107684 | 0.107684 | ±0.000000 | 1 | |
| 2.78e-06 | 0.137714 | 0.137714 | ±0.000000 | 1 | |
| 1.00e-06 | 0.661716 | 0.661716 | ±0.000000 | 1 | |
| 3.59e-03 | N/A | N/A | ±0.000000 | 1 | |
| 1.00e-02 | N/A | N/A | ±0.000000 | 1 | |

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Winner |
|---|---|---|---|---|---|
| 1.29e-03 | 0.045373 | 0.049160 | ±0.012787 | 3 | ★ |
| 4.64e-04 | 0.057508 | 0.054081 | ±0.013401 | 3 | |
| 1.67e-04 | 0.072378 | 0.061073 | ±0.020469 | 3 | |
| 5.99e-05 | 0.092992 | 0.083679 | ±0.014384 | 3 | |
| 2.15e-05 | 0.097570 | 0.092876 | ±0.007330 | 3 | |

Selected LR: 5.99e-05

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Survivor |
|---|---|---|---|---|---|
| 5.99e-05 | 0.016384 | 0.016384 | ±0.000000 | 1 | ✓ |
| 1.29e-03 | 0.021687 | 0.021687 | ±0.000000 | 1 | ✓ |
| 1.00e-06 | 0.023338 | 0.023338 | ±0.000000 | 1 | ✓ |
| 2.78e-06 | 0.023398 | 0.023398 | ±0.000000 | 1 | ✓ |
| 7.74e-06 | 0.023508 | 0.023508 | ±0.000000 | 1 | ✓ |
| 2.15e-05 | 0.023798 | 0.023798 | ±0.000000 | 1 | |
| 3.59e-03 | 0.036846 | 0.036846 | ±0.000000 | 1 | |
| 1.67e-04 | 0.041882 | 0.041882 | ±0.000000 | 1 | |
| 4.64e-04 | 0.044375 | 0.044375 | ±0.000000 | 1 | |
| 1.00e-02 | 0.083248 | 0.083248 | ±0.000000 | 1 | |

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Winner |
|---|---|---|---|---|---|
| 5.99e-05 | 0.020386 | 0.019843 | ±0.006952 | 3 | ★ |
| 1.00e-06 | 0.021527 | 0.032360 | ±0.015342 | 3 | |
| 7.74e-06 | 0.022097 | 0.026730 | ±0.006613 | 3 | |
| 2.78e-06 | 0.022698 | 0.028665 | ±0.008805 | 3 | |
| 1.29e-03 | 0.035063 | 0.034528 | ±0.012684 | 3 | |

Selected LR: 1.67e-04

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Survivor |
|---|---|---|---|---|---|
| 4.64e-04 | 0.017052 | 0.017052 | ±0.000000 | 1 | ✓ |
| 1.00e-06 | 0.021175 | 0.021175 | ±0.000000 | 1 | ✓ |
| 1.67e-04 | 0.023392 | 0.023392 | ±0.000000 | 1 | ✓ |
| 5.99e-05 | 0.025035 | 0.025035 | ±0.000000 | 1 | ✓ |
| 2.78e-06 | 0.026294 | 0.026294 | ±0.000000 | 1 | ✓ |
| 1.29e-03 | 0.027072 | 0.027072 | ±0.000000 | 1 | |
| 2.15e-05 | 0.028195 | 0.028195 | ±0.000000 | 1 | |
| 7.74e-06 | 0.029242 | 0.029242 | ±0.000000 | 1 | |
| 3.59e-03 | 0.038575 | 0.038575 | ±0.000000 | 1 | |
| 1.00e-02 | 0.062075 | 0.062075 | ±0.000000 | 1 | |

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Winner |
|---|---|---|---|---|---|
| 1.67e-04 | 0.017966 | 0.019587 | ±0.005641 | 3 | ★ |
| 5.99e-05 | 0.021267 | 0.021211 | ±0.003164 | 3 | |
| 2.78e-06 | 0.027245 | 0.027899 | ±0.001644 | 3 | |
| 1.00e-06 | 0.027737 | 0.026187 | ±0.003629 | 3 | |
| 4.64e-04 | 0.029821 | 0.026080 | ±0.005431 | 3 | |

Selected LR: 1.67e-04

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Survivor |
|---|---|---|---|---|---|
| 1.00e-06 | 0.025317 | 0.025317 | ±0.000000 | 1 | ✓ |
| 1.67e-04 | 0.025713 | 0.025713 | ±0.000000 | 1 | ✓ |
| 2.78e-06 | 0.031002 | 0.031002 | ±0.000000 | 1 | ✓ |
| 5.99e-05 | 0.032772 | 0.032772 | ±0.000000 | 1 | ✓ |
| 2.15e-05 | 0.032788 | 0.032788 | ±0.000000 | 1 | ✓ |
| 7.74e-06 | 0.033191 | 0.033191 | ±0.000000 | 1 | |
| 4.64e-04 | 0.033252 | 0.033252 | ±0.000000 | 1 | |
| 1.29e-03 | 0.033550 | 0.033550 | ±0.000000 | 1 | |
| 3.59e-03 | 0.050059 | 0.050059 | ±0.000000 | 1 | |
| 1.00e-02 | 0.071303 | 0.071303 | ±0.000000 | 1 | |

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Winner |
|---|---|---|---|---|---|
| 1.67e-04 | 0.017077 | 0.016965 | ±0.001962 | 3 | ★ |
| 5.99e-05 | 0.021077 | 0.024314 | ±0.005938 | 3 | |
| 1.00e-06 | 0.025465 | 0.026300 | ±0.001287 | 3 | |
| 2.15e-05 | 0.025921 | 0.023970 | ±0.006943 | 3 | |
| 2.78e-06 | 0.031002 | 0.030958 | ±0.000194 | 3 | |

Selected LR: 2.15e-05

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Survivor |
|---|---|---|---|---|---|
| 2.78e-06 | 0.066489 | 0.066489 | ±0.000000 | 1 | ✓ |
| 1.29e-03 | 0.067065 | 0.067065 | ±0.000000 | 1 | ✓ |
| 7.74e-06 | 0.067810 | 0.067810 | ±0.000000 | 1 | ✓ |
| 2.15e-05 | 0.068154 | 0.068154 | ±0.000000 | 1 | ✓ |
| 4.64e-04 | 0.069257 | 0.069257 | ±0.000000 | 1 | ✓ |
| 1.00e-06 | 0.070110 | 0.070110 | ±0.000000 | 1 | |
| 5.99e-05 | 0.071434 | 0.071434 | ±0.000000 | 1 | |
| 3.59e-03 | 0.075976 | 0.075976 | ±0.000000 | 1 | |
| 1.67e-04 | 0.077022 | 0.077022 | ±0.000000 | 1 | |
| 1.00e-02 | 0.089898 | 0.089898 | ±0.000000 | 1 | |

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Winner |
|---|---|---|---|---|---|
| 2.15e-05 | 0.055057 | 0.052492 | ±0.012177 | 3 | ★ |
| 7.74e-06 | 0.056518 | 0.053484 | ±0.012039 | 3 | |
| 4.64e-04 | 0.056892 | 0.056365 | ±0.013868 | 3 | |
| 2.78e-06 | 0.060483 | 0.055129 | ±0.012070 | 3 | |
| 1.29e-03 | 0.061592 | 0.057712 | ±0.005594 | 3 | |

Selected LR: 5.99e-05

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Survivor |
|---|---|---|---|---|---|
| 4.64e-04 | 0.031574 | 0.031574 | ±0.000000 | 1 | ✓ |
| 2.15e-05 | 0.032761 | 0.032761 | ±0.000000 | 1 | ✓ |
| 5.99e-05 | 0.032928 | 0.032928 | ±0.000000 | 1 | ✓ |
| 1.29e-03 | 0.033036 | 0.033036 | ±0.000000 | 1 | ✓ |
| 7.74e-06 | 0.034420 | 0.034420 | ±0.000000 | 1 | ✓ |
| 2.78e-06 | 0.038651 | 0.038651 | ±0.000000 | 1 | |
| 1.67e-04 | 0.038768 | 0.038768 | ±0.000000 | 1 | |
| 1.00e-06 | 0.043813 | 0.043813 | ±0.000000 | 1 | |
| 3.59e-03 | 0.047352 | 0.047352 | ±0.000000 | 1 | |
| 1.00e-02 | 0.101314 | 0.101314 | ±0.000000 | 1 | |

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Winner |
|---|---|---|---|---|---|
| 5.99e-05 | 0.030960 | 0.029528 | ±0.002364 | 3 | ★ |
| 2.15e-05 | 0.032761 | 0.033029 | ±0.000731 | 3 | |
| 7.74e-06 | 0.034420 | 0.033584 | ±0.002129 | 3 | |
| 1.29e-03 | 0.035592 | 0.040248 | ±0.007925 | 3 | |
| 4.64e-04 | 0.040910 | 0.040680 | ±0.001573 | 3 | |

Selected LR: 1.67e-04

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Survivor |
|---|---|---|---|---|---|
| 1.67e-04 | 0.033357 | 0.033357 | ±0.000000 | 1 | ✓ |
| 4.64e-04 | 0.037095 | 0.037095 | ±0.000000 | 1 | ✓ |
| 5.99e-05 | 0.037805 | 0.037805 | ±0.000000 | 1 | ✓ |
| 1.29e-03 | 0.038435 | 0.038435 | ±0.000000 | 1 | ✓ |
| 2.15e-05 | 0.038480 | 0.038480 | ±0.000000 | 1 | ✓ |
| 7.74e-06 | 0.041641 | 0.041641 | ±0.000000 | 1 | |
| 2.78e-06 | 0.044177 | 0.044177 | ±0.000000 | 1 | |
| 1.00e-06 | 0.047554 | 0.047554 | ±0.000000 | 1 | |
| 3.59e-03 | 0.085943 | 0.085943 | ±0.000000 | 1 | |
| 1.00e-02 | 0.088045 | 0.088045 | ±0.000000 | 1 | |

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Winner |
|---|---|---|---|---|---|
| 1.67e-04 | 0.033847 | 0.033366 | ±0.002508 | 3 | ★ |
| 5.99e-05 | 0.034530 | 0.033992 | ±0.001320 | 3 | |
| 4.64e-04 | 0.035849 | 0.036959 | ±0.002778 | 3 | |
| 1.29e-03 | 0.038204 | 0.035947 | ±0.003310 | 3 | |
| 2.15e-05 | 0.038480 | 0.036941 | ±0.003464 | 3 | |

Selected LR: 1.67e-04

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Survivor |
|---|---|---|---|---|---|
| 4.64e-04 | 0.032540 | 0.032540 | ±0.000000 | 1 | ✓ |
| 5.99e-05 | 0.032713 | 0.032713 | ±0.000000 | 1 | ✓ |
| 2.15e-05 | 0.033736 | 0.033736 | ±0.000000 | 1 | ✓ |
| 1.67e-04 | 0.034172 | 0.034172 | ±0.000000 | 1 | ✓ |
| 7.74e-06 | 0.035609 | 0.035609 | ±0.000000 | 1 | ✓ |
| 2.78e-06 | 0.037727 | 0.037727 | ±0.000000 | 1 | |
| 1.00e-06 | 0.040771 | 0.040771 | ±0.000000 | 1 | |
| 3.59e-03 | 0.041225 | 0.041225 | ±0.000000 | 1 | |
| 1.29e-03 | 0.041476 | 0.041476 | ±0.000000 | 1 | |
| 1.00e-02 | 0.071526 | 0.071526 | ±0.000000 | 1 | |

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Winner |
|---|---|---|---|---|---|
| 1.67e-04 | 0.026746 | 0.027829 | ±0.002081 | 3 | ★ |
| 5.99e-05 | 0.030441 | 0.031417 | ±0.001454 | 3 | |
| 2.15e-05 | 0.033736 | 0.033856 | ±0.001606 | 3 | |
| 7.74e-06 | 0.035609 | 0.034373 | ±0.002326 | 3 | |
| 4.64e-04 | 0.036166 | 0.038453 | ±0.004176 | 3 | |

Selected LR: 1.67e-04

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Survivor |
|---|---|---|---|---|---|
| 1.29e-03 | 0.034529 | 0.034529 | ±0.000000 | 1 | ✓ |
| 1.67e-04 | 0.034961 | 0.034961 | ±0.000000 | 1 | ✓ |
| 4.64e-04 | 0.035254 | 0.035254 | ±0.000000 | 1 | ✓ |
| 5.99e-05 | 0.035299 | 0.035299 | ±0.000000 | 1 | ✓ |
| 2.15e-05 | 0.035636 | 0.035636 | ±0.000000 | 1 | ✓ |
| 7.74e-06 | 0.039350 | 0.039350 | ±0.000000 | 1 | |
| 2.78e-06 | 0.043347 | 0.043347 | ±0.000000 | 1 | |
| 1.00e-06 | 0.047038 | 0.047038 | ±0.000000 | 1 | |
| 3.59e-03 | 0.058950 | 0.058950 | ±0.000000 | 1 | |
| 1.00e-02 | 0.092584 | 0.092584 | ±0.000000 | 1 | |

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Winner |
|---|---|---|---|---|---|
| 1.67e-04 | 0.033787 | 0.033880 | ±0.000140 | 3 | ★ |
| 4.64e-04 | 0.034986 | 0.036233 | ±0.003347 | 3 | |
| 2.15e-05 | 0.035636 | 0.034918 | ±0.002920 | 3 | |
| 5.99e-05 | 0.036204 | 0.036329 | ±0.000535 | 3 | |
| 1.29e-03 | 0.038523 | 0.039151 | ±0.001572 | 3 | |

Selected LR: 1.67e-04

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Survivor |
|---|---|---|---|---|---|
| 4.64e-04 | 0.036236 | 0.036236 | ±0.000000 | 1 | ✓ |
| 1.67e-04 | 0.037908 | 0.037908 | ±0.000000 | 1 | ✓ |
| 5.99e-05 | 0.038261 | 0.038261 | ±0.000000 | 1 | ✓ |
| 2.15e-05 | 0.040733 | 0.040733 | ±0.000000 | 1 | ✓ |
| 7.74e-06 | 0.041718 | 0.041718 | ±0.000000 | 1 | ✓ |
| 3.59e-03 | 0.042335 | 0.042335 | ±0.000000 | 1 | |
| 2.78e-06 | 0.044474 | 0.044474 | ±0.000000 | 1 | |
| 1.00e-06 | 0.045113 | 0.045113 | ±0.000000 | 1 | |
| 1.29e-03 | 0.046457 | 0.046457 | ±0.000000 | 1 | |
| 1.00e-02 | 0.067111 | 0.067111 | ±0.000000 | 1 | |

| Learning Rate | Median Loss | Mean Loss | Std Dev | Seeds | Winner |
|---|---|---|---|---|---|
| 1.67e-04 | 0.030360 | 0.030857 | ±0.000795 | 3 | ★ |
| 5.99e-05 | 0.031840 | 0.031721 | ±0.000385 | 3 | |
| 7.74e-06 | 0.032993 | 0.032672 | ±0.004874 | 3 | |
| 4.64e-04 | 0.033089 | 0.033610 | ±0.001520 | 3 | |
| 2.15e-05 | 0.033104 | 0.033880 | ±0.002000 | 3 | |

| Stage | Best Loss | Stop Reason | Samples Trained | Time |
|---|---|---|---|---|
| Stage 1 | 0.014492 | sample_budget (29 frames x 100000 = 2,900,000) | 9,500 | 00:07:12 |
| Stage 2 | 0.013782 | sample_budget (59 frames x 100000 = 5,900,000) | 5,000 | 00:05:20 |
| Stage 3 | 0.015400 | sample_budget (89 frames x 100000 = 8,900,000) | 4,500 | 00:04:46 |
| Stage 4 | 0.017583 | sample_budget (119 frames x 100000 = 11,900,000) | 4,500 | 00:04:44 |
| Stage 5 | 0.053134 | divergence | 7,000 | 00:05:58 |
| Stage 6 | 0.054374 | sample_budget (179 frames x 100000 = 17,900,000) | 4,000 | 00:04:12 |
| Stage 7 | 0.050990 | sample_budget (209 frames x 100000 = 20,900,000) | 4,500 | 00:04:49 |
| Stage 8 | 0.033792 | sample_budget (239 frames x 100000 = 23,900,000) | 3,500 | 00:03:42 |
| Stage 9 | 0.040510 | sample_budget (269 frames x 100000 = 26,900,000) | 4,500 | 00:04:42 |
| Stage 10 | 0.040081 | sample_budget (302 frames x 100000 = 30,200,000) | 3,500 | 00:03:45 |

| Stage | Orig Loss | Train Loss | Time | Samples | Stop Reason |
|---|---|---|---|---|---|
| 1 | 0.122660 | 0.014492 | 00:07:12 | 9500 | sample_budget (29 frames x 100000 = 2,900,000) |
| 2 | 0.102963 | 0.013782 | 00:05:20 | 5000 | sample_budget (59 frames x 100000 = 5,900,000) |
| 3 | 0.094348 | 0.015400 | 00:04:46 | 4500 | sample_budget (89 frames x 100000 = 8,900,000) |
| 4 | 0.078558 | 0.017583 | 00:04:44 | 4500 | sample_budget (119 frames x 100000 = 11,900,000) |
| 5 | 0.072847 | 0.053134 | 00:05:58 | 7000 | divergence |
| 6 | 0.069513 | 0.054374 | 00:04:12 | 4000 | sample_budget (179 frames x 100000 = 17,900,000) |
| 7 | 0.057033 | 0.050990 | 00:04:49 | 4500 | sample_budget (209 frames x 100000 = 20,900,000) |
| 8 | 0.055213 | 0.033792 | 00:03:42 | 3500 | sample_budget (239 frames x 100000 = 23,900,000) |
| 9 | 0.054252 | 0.040510 | 00:04:42 | 4500 | sample_budget (269 frames x 100000 = 26,900,000) |
| 10 ⭐ | 0.051771 | 0.040081 | 00:03:45 | 3500 | sample_budget (302 frames x 100000 = 30,200,000) |
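
The Orig Loss and Train Loss columns above are easier to compare as a relative reduction per stage. The sketch below is illustrative only: it assumes Orig Loss is the loss measured before a stage's main training and Train Loss the best loss reached afterwards (the report does not define either column), and the names `ROWS` and `relative_reduction` are hypothetical helpers, not part of the tool that produced this report.

```python
# (stage, orig_loss, train_loss) copied from the summary table above.
ROWS = [
    (1, 0.122660, 0.014492),
    (2, 0.102963, 0.013782),
    (3, 0.094348, 0.015400),
    (4, 0.078558, 0.017583),
    (5, 0.072847, 0.053134),
    (6, 0.069513, 0.054374),
    (7, 0.057033, 0.050990),
    (8, 0.055213, 0.033792),
    (9, 0.054252, 0.040510),
    (10, 0.051771, 0.040081),
]


def relative_reduction(orig: float, train: float) -> float:
    """Fraction by which the loss dropped; 0.25 means 25% lower than orig."""
    # Assumption: orig is the pre-training loss and is strictly positive.
    return (orig - train) / orig


if __name__ == "__main__":
    for stage, orig, train in ROWS:
        drop = relative_reduction(orig, train)
        print(f"Stage {stage:>2}: {orig:.6f} -> {train:.6f} ({drop:.1%} lower)")
```

Under that reading of the columns, stages 1 to 4 show much larger relative reductions than stages 5 to 7. The table alone does not say why, so treat the ratio as a reading aid rather than a diagnosis.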